SBIR Q2 Report
Domain Specific Customisations

Astro Simulation I/O Component

We implemented a brick reader tailored to the needs of the astrophysics simulation community. ParaView already has a brick reader, called the "raw" reader, but it is tailored to the needs of the medical imaging community. For instance, the raw reader uses the C++ iostream library, which is known to perform suboptimally on systems with parallel file systems; it treats a file series as a stack of images rather than a collection of time steps; and it handles vector data only in an interleaved, single-file format.

For the astro simulation community our reader improves on ParaView's raw reader in three specific ways. First, we use MPI-I/O functions for all disk access; these are optimised for scalability and performance on parallel file systems. This is a key distinction, as the astro simulation community deals with much larger datasets located on systems that have parallel file systems. Second, our reader handles vectors where each component is written to its own brick, as is often the case for scientific data. Third, our reader treats a series of files as a collection of time steps, which is most often the case for astro simulation results.

In the coming quarter we plan to develop a number of features that will increase the performance and usability of the reader. For instance, we will implement a striding feature that allows the data to be subsampled from disk during the read. This will increase interactivity during preliminary investigation by reducing the reader's memory footprint. We plan to add XML configurability so that the reader can easily be made to understand output from a variety of simulations. We plan to expose a number of options to the user that enable fine tuning of performance parameters at run time.
One example is the ability to configure the number of processes that read from disk. Such a feature will be useful for distinguishing between a system that has many cores sharing a single pipe to disk and a system whose cores all have their own pipes to disk. Finally, we plan to benchmark the reader on systems that employ a parallel file system.

Magnetic Field Topology Analysis Tools

Late in July we began collaborating with NASA Goddard scientist John Dorelli on a magnetic field topology analysis tool set for ParaView. The tools will help analyse magnetic field topology in a number of existing global geomagnetic simulations. The tool set will consist of a number of parallel reader plugins as needed to open the data in ParaView, a field tracer plugin, a seed point analysis plugin that renders a visual representation of the global field topology, and a field null finder that allows for verification of the result. Dr. Dorelli will aid in the refinement of the algorithms developed.

Addressing Usability Issues

This section details a number of usability issues that we have discovered by using ParaView on the Lobo (LANL), Pleiades (NASA), Ranger (TACC), Spur (TACC), and Nashi (UCSD) systems. This list is far from complete and is still growing. We plan to make decisions on which of these to address.

Graphical Manipulators for Sources

Sources are ParaView objects that generate data. Their primary uses are as probes, as seed points for field line tracers, and as annotations. Many of the sources have no graphical manipulator, and their default values are unrelated to the current context.

A specific example of a source object with no graphical manipulator is the "plane source". It is extremely tedious to have to manipulate the plane source by hand through text entry boxes. The plane source is defined by an origin and two axis points; the current defaults are fixed values that have no relation to the current visualisation.
At the very least this source should have a manipulator similar to that of the "slice" filter, in addition to the text entries. The manipulator could then be used for rough placement and visual feedback, and the text entries could be used for fine tuning.

Solution: develop a manipulator that allows for graphical positioning of the plane, and use upstream bounds to create reasonable default values.

Note: The existing slice manipulator will not be suitable, since it uses a center and a normal for positioning while the plane source uses 3 points. Using 3 points differs in that a parallelogram is defined and its bounds are implicit. I looked into using the existing slice manipulator and found that it is not clear which points to use when configuring the plane source. The plane source manipulator should consist of 3 points (an origin and two axis points) and some sort of handle for rotation.

A specific example of a source with a difficult to use manipulator is the "point source". The point source manipulator is tedious to use because there is no way to constrain the movement of the point. It would be advantageous to have the option to constrain the movement to a plane or to a curve, for instance.

Multiple Input Dialog

The multiple input dialog is confusing and difficult to use, and it often causes ParaView to crash. In ParaView 3.4 only the first input (called input) can select multiple items, while the second input (called source) cannot. Attempting to select multiple items on the second input, which is a common use case for the stream tracer, results in a crash (3.4 only). Changing the selection on the second input results in error messages and a crash. The pipeline preview is confusing; there should be a much simpler and clearer way to graphically represent the pipeline for filters with multiple inputs.
The following error is reported frequently and precedes the crashes:

QAbstractItemModel::endRemoveRows: Invalid index ( 3 , 0 ) in model pqPipelineModel(0x2f93de0)

No Method for Duplication of Pipeline Objects

There is no way to duplicate a source. Say you have configured a plane source: it would be useful to be able to right click its icon in the pipeline browser and choose a context menu option to duplicate it.

No Method to Save and Restore Individual Pipeline Objects

It would be very useful to be able to save and restore individual pipeline elements, such as sources, independent of the application state. Often we need the same sources over and over; it would be a time-saver to define a source once, save it, and then reuse it across sessions. For example, we may need a sphere source representing an inner ionosphere boundary in a simulation, and the sphere source will have the same configuration across many runs.

Better Progress Reporting Needed

During pipeline execution progress is reported via the progress bar. However, progress is not reported during network data transfers, which, when remote rendering is disabled, leaves the GUI in an apparently hung state. Also, for time consuming tasks during a pipeline update, progress messages are few and far between, again leaving the GUI in an apparently hung state. To make matters worse, many internal operations, such as network transfers and metadata transfers, do not report progress. These operations occur after the pipeline update, and it is often difficult to recognise when they complete.

In addition to adding progress reporting to the ParaView internal operations that execute after the user's pipeline runs, we may display progress reports in a modal dialog. This will give a positive indication that a job is finished.

No Way to Cancel Long Operations

There is currently no way to cancel long operations, for example saving an animation of a large time series. One must kill the client or server.
It would be very useful to have the option to abort in the middle of such an operation. One solution may be to implement observer threads and a POSIX signal handler to handle interruption. This issue is related to the progress reporting issue.

Performance and Scalability Issue of Stream Tracer

The vtkDistributedStreamTracer is a serial implementation that suffers from scalability issues. Its run time increases linearly with the number of processes. Each seed point is processed serially, passing from process to process one by one, so only one process is ever active at any given instant. The algorithm needs to be parallelized to achieve scalability.

Editing Volume Rendering Settings Issues

Editing the transfer function for volume rendering is nearly impossible when connected to a remote server, because of continuous synchronisation whereby each mouse action can induce a server side render event. When volume rendering large remote data this is a major issue, as each update can take on the order of tens to hundreds of seconds.

Additionally, the mechanism for defining the transfer function is clunky at best and lacks sophistication. Currently one must introduce points and drag them around a small area to define the transfer function. Desirable actions, such as setting a Gaussian about a specific point, are impossible. The current transfer function editor is very difficult to use, yet editing the transfer function is a critical task when volume rendering.

Addressing Remote Interactivity Issues in ParaView

Introduction

We investigated complaints of poor interactivity when ParaView was run remotely between the user's home and the LANL cluster called Lobo [1]. We successfully reproduced the loss of interactivity running ParaView between a home system and Lobo by adding visual elements to the rendered scene. Initially interactivity was fine; however, as visual elements were added we experienced the loss of interactivity.
We determined that the loss of interactivity was independent of the input data set and its size, and was not linked to a specific visualisation technique. This enabled us to reproduce the problem on the UCSD cluster called Nashi using ParaView's internal sources and test recording mechanism. By instrumenting the ParaView client-server image delivery subsystem we confirmed that frame rates dropped dramatically as visual elements were added to the scene.

We have identified the compression scheme employed by the image delivery subsystem as the root of the remote interactivity issues. ParaView's image delivery subsystem employs a run length encoding scheme developed by Sandia labs called "Squirt". Like all run length encoding schemes, Squirt is fast but does not deliver high compression ratios. We found that the compression ratio degraded quickly as visual elements were added to the scene.

We compared the Squirt compression scheme against two popular compression schemes: zlib, which implements the deflate algorithm used by the Linux gzip tool, and szip, which is employed in the HDF5 library. The tests show that, at its highest setting, zlib's compression ratios are higher than Squirt's across the board, by close to an order of magnitude on some inputs; however, its run time is also an order of magnitude longer. Benchmarks show that ParaView's frame delivery times are in turn an order of magnitude longer than the compressor run times, so compressor run time is not a significant factor here.

In order to confirm the result we implemented a second suite of in-situ tests, implementing zlib compression in ParaView such that when Squirt was turned off, zlib was turned on. This enabled us to benchmark ParaView using both Squirt and zlib side by side on real test data. The results show that in the cases where Squirt's frame rates diminish to non-interactive rates, using zlib produces a speed up of a factor of 4.39. Although modest, this provides significantly better interactivity, with image delivery times dropping from an average of 5.26 seconds to 1.20 seconds per frame.
To put this into context, a 1.2 second average frame delivery time results in jerky, stilted interaction, while a 5.26 second average frame delivery time is essentially unusable and non-interactive.

Testing on Ranger (TACC), Spur (TACC), and Pleiades (NASA) will be completed early in the next quarter. We plan to explore the application of compression to geometry transfers, which due to their large size stand to benefit much more than the image transfers. Once testing is complete we will submit the modifications to Kitware for inclusion in the ParaView source tree.

Results

To motivate the planned modifications to ParaView's image delivery subsystem we ran two banks of tests. The first was a bank of stand-alone command line tests that show the general efficacy of various compression schemes on typical rendered images. The second was a bank of in-place tests in which ParaView's image delivery subsystem was modified so that the best compression algorithm from the first bank of tests could be compared to ParaView's default compression, Squirt. The following subsections detail the procedures used and lay out our results.

A Benchmark of Squirt and Zlib Compression in ParaView

For this suite of tests we implemented zlib compression in ParaView so that a side by side comparison of zlib and Squirt could be made under real world conditions. In these in-situ tests we seek to isolate the compression scheme as much as possible and to stress test the algorithms in a worst case scenario. To this end, in the render-server configuration dialog (Edit->Settings...->Render View->{General,Server}) we set the Remote Render Threshold to 0, disabled Image sub-sampling, and set the LOD Threshold to 0. We set the Squirt compressor to loss-less mode at 24 bpp and the zlib compressor to its highest setting, which is also its slowest. The results are summarised in the following table and histogram.

For each scheme two renderings were used. The first is an outline of a two dimensional dataset.
This rendering has large areas of uniform color and compresses well with both Squirt and zlib, resulting in interactive frame rates for both. The second rendering, of a fractal image, has a lot of color variation and is used to confound the compression schemes. It represents a worst case input that causes Squirt's compression ratio to fall so low that frame rates are not interactive. In this case Squirt's compression ratio drops dramatically to 2.9 and frame rates fall below interactive rates, with each frame taking on average 5.26 seconds. Zlib, on the other hand, delivers a compression ratio 16.07 times higher than Squirt's, resulting in frame delivery that is 4.39 times faster. The speed up factor of 4.39 is a modest improvement; however, zlib's average delivery time of 1.2 seconds per frame is a significant improvement in interactivity over Squirt's average of 5.26 seconds per frame. In the table below, the speed up is Squirt's average seconds per frame divided by zlib's, and the reduction factor is zlib's compression ratio divided by Squirt's.

Test Results

| Scheme | Metric | Dataset Outline | Fractal Image |
|---|---|---|---|
| Squirt | Compression Ratio | 33.5397 | 2.93691 |
| Squirt | Avg. Seconds per Frame | 0.478120 | 5.257553 |
| ZLib-9 | Compression Ratio | 313.838 | 47.1934 |
| ZLib-9 | Avg. Seconds per Frame | 0.247824 | 1.197212 |
| ZLib-9 relative to Squirt | Avg. Speed Up | 1.92927 | 4.3915 |
| ZLib-9 relative to Squirt | Avg. Reduction Factor | 9.3572 | 16.0690 |

Input Datasets

The following images are screen shots taken from ParaView during the tests. We used a saved state file to ensure the same initial conditions and an automated test script, generated using ParaView's Tools->Record Test feature, to ensure reproducibility.

Images used for Frame Rate Benchmark

A General Comparison of Loss-less Compression Schemes on Images Rendered by ParaView

The following table summarises the test sequence we implemented to compare Squirt, the current compression algorithm used in ParaView, to two potential alternatives, szip and zlib. Each scheme was tested using various compression levels. Two passes were made over each input image. In the first pass the full 24 bit BMP was processed. In the second pass, Squirt's level 5 color reduction algorithm was applied. This results in 10 bit color data.
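
The effect of color depth reduction on run lengths can be illustrated with a minimal sketch. It assumes the 10 bits are split as 3 red, 4 green and 3 blue bits by masking off the low-order bits of each channel. That split and the names `reduce_color_depth` and `count_runs` are assumptions made for illustration; they are not taken from Squirt's source.

```python
def reduce_color_depth(pixels, mask=(0xE0, 0xF0, 0xE0)):
    """Zero the low-order bits of each (r, g, b) channel.

    The default mask keeps 3 + 4 + 3 = 10 bits per pixel (an assumed split).
    """
    mask_r, mask_g, mask_b = mask
    return [(r & mask_r, g & mask_g, b & mask_b) for r, g, b in pixels]


def count_runs(pixels):
    """Count maximal runs of identical consecutive pixels."""
    runs = 0
    previous = None
    for pixel in pixels:
        if pixel != previous:
            runs += 1
            previous = pixel
    return runs


# A smooth grey gradient: at 24 bpp every pixel differs from its neighbour.
gradient = [(i, i, i) for i in range(256)]

print(count_runs(gradient))                      # 256
print(count_runs(reduce_color_depth(gradient)))  # 16
```

Fewer, longer runs are what a run length encoder needs, which is why the reduced data shows Squirt at its best.
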
The reduced color data was then used as input to each of the schemes. We did this to showcase Squirt at its best and to level the playing field, accounting for the fact that the other schemes tested are loss-less.

The purpose of the tests was to compare Squirt to various alternatives; therefore we used Squirt's timing and compressed data size as the basis for relative comparisons. The columns Relative Time Delta and Relative Compression Ratio contain the primary results, while the other columns contain the raw data.

Squirt is a run length encoding (RLE) scheme and can be set to loss-less or lossy compression. Squirt's lossy compression uses a color depth reduction technique to increase run length. Squirt's strong point is its speed: of the schemes tested it was the fastest. Its weakness is that it achieves a relatively low compression ratio.

Szip is a scheme available in HDF4 and HDF5 and patented by NASA. Our tests show that for typical images rendered by ParaView szip does relatively poorly compared to Squirt and zlib. However, it is interesting to note that on the most challenging input used in the tests szip performs reasonably well.

Zlib is a freely licensed implementation of deflate, the scheme originally used in pkzip. It has fixed memory requirements independent of input data size and essentially never expands data. At its highest setting zlib achieved the highest compression ratios for all test inputs, between 1.4 and 8.6 times higher than Squirt's. The trade off, as is often the case, is speed: the run time of the zlib compressor at that setting is an order of magnitude longer than Squirt's. However, our in-situ benchmarks show that compressor run time does not contribute significantly to frame rate. Decreasing the color depth of the input images further increased zlib's compression ratio, most notably on the most challenging input (B_highres.bmp, from 4.18 to 13.0 at the highest setting), with only a small change in compressor run time. The results suggest that it will likely be worthwhile to pre-process rendered images by reducing the color depth, much as is done for Squirt in typical use.
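
The difference between run length encoding and deflate described above can be sketched with a few lines of Python. This is a minimal sketch, not Squirt's actual format: `rle_encode` is a generic byte-oriented RLE written for illustration, and the two images are hypothetical synthetic stand-ins, not the renderings used in our tests. Only the `zlib` calls refer to a real library (Python's standard `zlib` module).

```python
import zlib


def rle_encode(pixels):
    """Encode 24-bit pixel values as 4-byte (run length, pixel) records."""
    out = bytearray()
    i = 0
    while i < len(pixels):
        value = pixels[i]
        run = 1
        # Extend the run while the next pixel matches; cap at 255 (one byte).
        while run < 255 and i + run < len(pixels) and pixels[i + run] == value:
            run += 1
        out.append(run)
        out += value.to_bytes(3, "big")
        i += run
    return bytes(out)


def rle_decode(data):
    """Invert rle_encode."""
    pixels = []
    for k in range(0, len(data), 4):
        pixels.extend([int.from_bytes(data[k + 1:k + 4], "big")] * data[k])
    return pixels


def compression_ratios(pixels):
    """Return raw size divided by compressed size for RLE and zlib level 9."""
    raw = b"".join(p.to_bytes(3, "big") for p in pixels)
    return {
        "rle": len(raw) / len(rle_encode(pixels)),
        "zlib-9": len(raw) / len(zlib.compress(raw, 9)),
    }


WIDTH = HEIGHT = 128

# Hypothetical stand-in for an outline: black background, a few white lines.
outline = [0xFFFFFF if y % 32 == 0 else 0x000000
           for y in range(HEIGHT) for x in range(WIDTH)]

# Hypothetical busy image: neighbouring pixels almost never match, but the
# pattern repeats, which a dictionary coder such as deflate can exploit.
palette = [(17 * k) << 16 | (255 - 17 * k) << 8 | (7 * k) for k in range(16)]
busy = [palette[(x ^ y) % 16] for y in range(HEIGHT) for x in range(WIDTH)]

for name, image in (("outline", outline), ("busy", busy)):
    ratios = compression_ratios(image)
    print(name, {k: round(v, 2) for k, v in ratios.items()})
```

On the busy pattern the run length coder expands the data, since almost every 3-byte pixel becomes a 4-byte record, while deflate still finds the repetition. Real renderings are far less regular than this synthetic pattern, so the gap measured in our tests is smaller.
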
Test Results

Rows marked with * (Squirt-0) are the baseline for the relative columns. The first row of each group gives the input image, its color depth, and its uncompressed size.

| Input Dataset | Case Number | Scheme Name | Final Size | Compression Ratio | Relative Compression Ratio | Time Delta | Relative Time Delta |
|---|---|---|---|---|---|---|---|
| 1_streams.bmp, 24 bpp | | (input) | 1597995 | | | | |
| | 1 | * Squirt-0 | 204388 | 7.81844 | 1 | 0.0130591 | 1 |
| | 2 | szip-ec | 1320347 | 1.21028 | 0.154799 | 0.0280352 | 2.14678 |
| | 3 | szip-nn | 1261476 | 1.26677 | 0.162023 | 0.0361311 | 2.76673 |
| | 4 | zlib-1 | 63113 | 25.3196 | 3.23845 | 0.0157809 | 1.20842 |
| | 5 | zlib-2 | 59978 | 26.643 | 3.40772 | 0.0160151 | 1.22635 |
| | 6 | zlib-3 | 50819 | 31.4448 | 4.02188 | 0.0168622 | 1.29121 |
| | 7 | zlib-5 | 41633 | 38.3829 | 4.90928 | 0.024646 | 1.88726 |
| | 8 | zlib-9 | 23779 | 67.2019 | 8.59532 | 0.194409 | 14.8868 |
| 1_streams.bmp, 10 bpp | | (input) | 1597995 | | | | |
| | 9 | * Squirt-0 | 204364 | 7.81936 | 1 | 0.0118198 | 1 |
| | 10 | szip-ec | 1320211 | 1.21041 | 0.154796 | 0.0267529 | 2.26339 |
| | 11 | szip-nn | 1269975 | 1.25829 | 0.16092 | 0.0354099 | 2.9958 |
| | 12 | zlib-1 | 63117 | 25.318 | 3.23786 | 0.0157561 | 1.33302 |
| | 13 | zlib-2 | 59992 | 26.6368 | 3.40652 | 0.0160511 | 1.35798 |
| | 14 | zlib-3 | 50714 | 31.5099 | 4.02974 | 0.016974 | 1.43606 |
| | 15 | zlib-5 | 41507 | 38.4994 | 4.9236 | 0.0246332 | 2.08405 |
| | 16 | zlib-9 | 23729 | 67.3435 | 8.61242 | 0.193784 | 16.3948 |
| 2_streams.bmp, 24 bpp | | (input) | 1597995 | | | | |
| | 17 | * Squirt-0 | 419484 | 3.80943 | 1 | 0.0156109 | 1 |
| | 18 | szip-ec | 1315757 | 1.21451 | 0.318816 | 0.028928 | 1.85306 |
| | 19 | szip-nn | 1271709 | 1.25657 | 0.329858 | 0.036212 | 2.31965 |
| | 20 | zlib-1 | 124372 | 12.8485 | 3.37282 | 0.0213711 | 1.36898 |
| | 21 | zlib-2 | 117238 | 13.6304 | 3.57805 | 0.0222421 | 1.42478 |
| | 22 | zlib-3 | 105137 | 15.1992 | 3.98988 | 0.0261979 | 1.67818 |
| | 23 | zlib-5 | 90809 | 17.5973 | 4.61941 | 0.035774 | 2.2916 |
| | 24 | zlib-9 | 67194 | 23.7818 | 6.24288 | 0.573777 | 36.7548 |
| 2_streams.bmp, 10 bpp | | (input) | 1597995 | | | | |
| | 25 | * Squirt-0 | 419460 | 3.80965 | 1 | 0.0147369 | 1 |
| | 26 | szip-ec | 1313854 | 1.21627 | 0.319259 | 0.0280142 | 1.90096 |
| | 27 | szip-nn | 1279024 | 1.24939 | 0.327953 | 0.0354748 | 2.40721 |
| | 28 | zlib-1 | 124316 | 12.8543 | 3.37414 | 0.0213621 | 1.44956 |
| | 29 | zlib-2 | 117225 | 13.6319 | 3.57825 | 0.0221038 | 1.49989 |
| | 30 | zlib-3 | 105045 | 15.2125 | 3.99315 | 0.0260451 | 1.76734 |
| | 31 | zlib-5 | 90760 | 17.6068 | 4.62164 | 0.0358241 | 2.43091 |
| | 32 | zlib-9 | 67153 | 23.7963 | 6.24633 | 0.574763 | 39.0017 |
| B_highres.bmp, 24 bpp | | (input) | 6912000 | | | | |
| | 33 | * Squirt-0 | 2380100 | 2.90408 | 1 | 0.0646422 | 1 |
| | 34 | szip-ec | 1710716 | 4.04041 | 1.39129 | 0.0686312 | 1.06171 |
| | 35 | szip-nn | 2090999 | 3.3056 | 1.13826 | 0.079762 | 1.2339 |
| | 36 | zlib-1 | 1711898 | 4.03762 | 1.39033 | 0.136663 | 2.11414 |
| | 37 | zlib-2 | 1704298 | 4.05563 | 1.39653 | 0.138913 | 2.14895 |
| | 38 | zlib-3 | 1693065 | 4.08254 | 1.40579 | 0.141505 | 2.18905 |
| | 39 | zlib-5 | 1676255 | 4.12348 | 1.41989 | 0.179624 | 2.77874 |
| | 40 | zlib-9 | 1652838 | 4.1819 | 1.44001 | 0.818812 | 12.6668 |
| B_highres.bmp, 10 bpp | | (input) | 6912000 | | | | |
| | 41 | * Squirt-0 | 1873368 | 3.68961 | 1 | 0.0599349 | 1 |
| | 42 | szip-ec | 1701702 | 4.06182 | 1.10088 | 0.0674322 | 1.12509 |
| | 43 | szip-nn | 2077745 | 3.32668 | 0.901635 | 0.0791159 | 1.32003 |
| | 44 | zlib-1 | 815701 | 8.47369 | 2.29664 | 0.117282 | 1.95682 |
| | 45 | zlib-2 | 775508 | 8.91287 | 2.41567 | 0.120061 | 2.00319 |
| | 46 | zlib-3 | 695275 | 9.94139 | 2.69443 | 0.137728 | 2.29796 |
| | 47 | zlib-5 | 601728 | 11.4869 | 3.11331 | 0.205056 | 3.42131 |
| | 48 | zlib-9 | 531795 | 12.9975 | 3.52273 | 0.857498 | 14.3072 |

Input Datasets

The following images were used as command line parameters to our test code.

Input Images used for the General Comparison

Notes

1. Lobo - 272 compute nodes, 4,352 CPU, 38 TFLOPS system. Quad-core, quad-socket AMD Opteron w/ Infiniband.
2. Nashi - 22 compute nodes, 88 CPU, 264 G RAM. Dual-core, dual-socket AMD Opteron w/ Infiniband.
3. http://www.zlib.net/
4. http://www.hdfgroup.org/doc_resource/SZIP/

Source Code Management and Documentation

We set up a Subversion repository located at nashi-submaster.ucsd.edu:/data/nas0/svn. All work related to the SciVis SBIR is available there. We set up a wiki at http://nashi-submaster.ucsd.edu/SciVisWiki for project documentation. Currently we are using a public wiki located at http://scivis.wikia.com; in the coming quarter we will move the current documentation to the new wiki.